Automatic Identification of Zero Pronouns and their Antecedents within Aligned Sentence Pairs

Author

  • Hiromi Nakaiwa
Abstract

This paper proposes a method to identify zero pronouns within a Japanese sentence and their antecedent equivalents within the corresponding English sentence from aligned sentence pairs. The method focuses on the characteristics of Japanese and English, two languages from different families in which the distribution of zero pronouns is very different. In this method, the Japanese sentence and English translation within the Japanese and English aligned sentence pairs are analyzed. Then, the pairs of Japanese word/phrase and their English equivalent word/phrase are identified from each aligned sentence pair. Next, zero pronouns within a Japanese sentence are identified by using the syntactic and semantic structure of the Japanese sentence, and their antecedents within the English sentence are identified by using the characteristics of anaphoric and deictic expressions in English. This method was implemented using the Japanese-to-English machine translation system ALT-J/E for the analysis of Japanese sentences and Brill's tagger for the analysis of the English sentences. According to my evaluation, for 554 zero pronouns in a sentence set for the evaluation of Japanese-to-English machine translation systems, 91.5% of the pairs of zero pronouns in the Japanese sentences and their antecedents in the English translations were automatically identified correctly.

1 Introduction

1.1 Motivation

In natural languages, elements that can be easily deduced by the reader are frequently omitted from expressions in texts (Kuno, 1978). This phenomenon causes considerable problems in natural language processing systems. For example, in a machine translation system, the system needs to recognize those elements which are not present in the source language, but may become mandatory elements in the target language. In particular, the subject and object are often omitted in Japanese, whereas they are normally obligatory in English.
Thus, in Japanese-to-English machine translation systems, it is necessary to identify case elements omitted from the original Japanese ("zero pronouns") for their translation into English expressions. Several algorithms have been proposed with regard to this problem (Kameyama, 1986; Walker et al., 1990; Yoshimoto, 1988; Dousaka, 1994). When considering the application of these methods to a practical machine translation system for which the translation target area cannot be limited, it is not possible to apply them directly, both because their precision of resolution is low as they only use limited information, and because the volume of knowledge that must be prepared beforehand is so large.
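
The steps named in the abstract (analyze both sentences, align words and phrases, find zero pronouns on the Japanese side, find their antecedent equivalents on the English side) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the `JaPredicate` and `EnArgument` structures, the `CASE_TO_ROLE` mapping, and the rule "pick the unaligned English argument with the matching role" are hypothetical stand-ins, not the analysis performed by ALT-J/E or Brill's tagger and not the paper's actual identification rules.

```python
from dataclasses import dataclass


@dataclass
class JaPredicate:
    """Hypothetical analyzer output for one Japanese predicate."""
    verb: str
    required_cases: tuple   # cases assumed obligatory for this verb, e.g. ("ga", "wo")
    filled_cases: dict      # case marker -> Japanese phrase actually present


@dataclass
class EnArgument:
    """Hypothetical analyzer output for one English argument."""
    text: str
    role: str               # "subject" or "object"


# Assumed mapping from Japanese case markers to English grammatical roles.
CASE_TO_ROLE = {"ga": "subject", "wo": "object"}


def find_zero_pronouns(pred):
    """Return the obligatory cases that have no overt filler."""
    return [c for c in pred.required_cases if c not in pred.filled_cases]


def pair_with_english(pred, en_args, alignment):
    """Pair each zero pronoun with an unaligned English argument of the same role.

    `alignment` maps Japanese phrases to their English equivalents.
    """
    aligned_english = set(alignment.values())
    pairs = {}
    for case in find_zero_pronouns(pred):
        role = CASE_TO_ROLE.get(case)
        for arg in en_args:
            # An English argument with no Japanese counterpart is a candidate
            # translation equivalent of the omitted element.
            if arg.role == role and arg.text not in aligned_english:
                pairs[case] = arg.text
                break
    return pairs


# Toy pair: "hon wo yonda" / "I read the book" (subject omitted in Japanese).
pred = JaPredicate(verb="yonda", required_cases=("ga", "wo"),
                   filled_cases={"wo": "hon"})
en_args = [EnArgument("I", "subject"), EnArgument("the book", "object")]
alignment = {"hon": "the book", "yonda": "read"}

result = pair_with_english(pred, en_args, alignment)
print(result)  # {'ga': 'I'}
```

In this toy pair the subject case "ga" has no filler in the Japanese sentence, and "I" is the only English argument without an aligned Japanese phrase, so the two are paired. A real system would need full syntactic and semantic analysis on both sides, as the abstract describes.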

Similar resources

Automatic Extraction Of Rules For Anaphora Resolution Of Japanese Zero Pronouns From Aligned Sentence Pairs

This paper proposes a method to extract rules for anaphora resolution of Japanese zero pronouns from aligned sentence pairs. The method focuses on the characteristics of Japanese and English in which both the language families and the distribution of zero pronouns are very different. In this method, zero pronouns in the Japanese sentence and the English translation equivalents of their antecede...

Automatic Detection of Antecedents of Japanese Zero Pronouns Using a Japanese-English Bilingual Corpus

In this paper we present a method of detecting zero pronouns in Japanese clauses and identifying their antecedents using aligned sentence pairs from a Japanese-English bilingual corpus and open resource tools. We use syntactic and semantic structures and the alignment of words and phrases in the sentence pairs to automatically detect zero pronouns and determine their antecedents using English t...

Identification of Zero Pronouns and their Antecedents within Aligned Sentence Pairs

This paper proposes a method to identify zero pronouns within a Japanese sentence and their antecedent equivalents within the corresponding English sentence from aligned sentence pairs. The method focuses on the characteristics of Japanese and English, two languages from different families in which the distribution of zero pronouns is very different. In this method, the Japanese sentence an...

Anaphora Resolution of Japanese Zero Pronouns with Deictic Reference

This paper proposes a method to resolve the reference of deictic Japanese zero pronouns which can be implemented in a practical machine translation system. This method focuses on semantic and pragmatic constraints such as semantic constraints on cases, modal expressions, verbal semantic attributes and conjunctions to determine the deictic reference of Japanese zero pronouns. This method is high...

Zero Pronoun Resolution in Thai: A Centering Approach

Since pronouns can be dropped in Thai, a natural language processing system for Thai must be able to resolve referents of the missing pronouns. One of several approaches that have been used for reference resolution is Centering Theory. Centering Theory is a focusing process in which salience of discourse entities is being kept track of. Referents of pronouns or zero pronouns are usually entitie...

Publication date: 1997